Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n)

Authors

Francis R. Bach

Eric Moulines

Abstract

We consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk. We focus on problems without strong convexity, for which all previously known algorithms achieve a convergence rate for function values of O(1/√n) after n iterations. We consider and analyze two algorithms that achieve a rate of O(1/n) for classical supervised learning problems. For least-squares regression, we show that averaged stochastic gradient descent with constant step-size achieves the desired rate. For logistic regression, this is achieved by a simple novel stochastic gradient algorithm that (a) constructs successive local quadratic approximations of the loss functions, while (b) preserving the same running-time complexity as stochastic gradient descent. For these algorithms, we provide a non-asymptotic analysis of the generalization error (in expectation, and also in high probability for least-squares), and run extensive experiments showing that they often outperform existing approaches.
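As a rough illustration of the least-squares case described in the abstract, the following is a minimal sketch of averaged stochastic gradient descent with a constant step size. The function names, the synthetic Gaussian data stream, the starting point at zero, and the step size of 0.05 are all assumptions made for this example; they are not taken from the paper, and the step size in particular is an arbitrary illustrative choice rather than the value analyzed by the authors.

```python
import random


def averaged_sgd_least_squares(samples, dim, step_size):
    """Constant-step-size SGD on the squared loss; returns the averaged iterate.

    `samples` is an iterable of (x, y) pairs, where x is a list of `dim` floats.
    """
    theta = [0.0] * dim      # current SGD iterate (assumed to start at zero)
    theta_bar = [0.0] * dim  # running average of the iterates theta_1..theta_n
    for n, (x, y) in enumerate(samples, start=1):
        # Stochastic gradient of 0.5 * (<x, theta> - y)^2 is residual * x.
        residual = sum(xi * ti for xi, ti in zip(x, theta)) - y
        theta = [ti - step_size * residual * xi for xi, ti in zip(x, theta)]
        # Online mean: bar_n = bar_{n-1} + (theta_n - bar_{n-1}) / n.
        theta_bar = [bi + (ti - bi) / n for bi, ti in zip(theta_bar, theta)]
    return theta_bar


def make_stream(theta_star, n, noise_std, rng):
    """Hypothetical data generator: Gaussian features, linear model plus noise."""
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in theta_star]
        y = sum(xi * ti for xi, ti in zip(x, theta_star)) + rng.gauss(0.0, noise_std)
        yield x, y


rng = random.Random(0)
theta_star = [1.0, -2.0, 0.5]  # assumed ground-truth parameter for the demo
estimate = averaged_sgd_least_squares(
    make_stream(theta_star, 20000, 0.1, rng), dim=3, step_size=0.05
)
print(estimate)
```

The step size stays fixed throughout, and it is the average of the iterates, not the last iterate, that is returned. This matches the combination the abstract credits with the O(1/n) rate for least-squares regression.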


Similar resources

[hal-00831977, v1] Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n)


Stochastic Coordinate Descent for Nonsmooth Convex Optimization

Stochastic coordinate descent, due to its practicality and efficiency, is increasingly popular in the machine learning and signal processing communities, as it has proven successful in several large-scale optimization problems, such as l1-regularized regression and Support Vector Machines, to name a few. In this paper, we consider a composite problem where the nonsmoothness has a general structure that...


Fast Convergence of Stochastic Gradient Descent under a Strong Growth Condition

We consider optimizing a smooth convex function f that is the average of a set of differentiable functions f_i, under the assumption considered by Solodov [1998] and Tseng [1998] that the norm of each gradient f_i' is bounded by a linear function of the norm of the average gradient f'. We show that under these assumptions the basic stochastic gradient method with a sufficiently-small c...


VR-SGD: A Simple Stochastic Variance Reduction Method for Machine Learning

In this paper, we propose a simple variant of the original SVRG, called variance reduced stochastic gradient descent (VR-SGD). Unlike the choices of snapshot and starting points in SVRG and its proximal variant, Prox-SVRG, the two vectors of VR-SGD are set to the average and last iterate of the previous epoch, respectively. The settings allow us to use much larger learning rates, and also make ...


Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization

Stochastic gradient descent (SGD) is a simple and popular method to solve stochastic optimization problems which arise in machine learning. For strongly convex problems, its convergence rate was known to be O(log(T)/T), by running SGD for T iterations and returning the average point. However, recent results showed that using a different algorithm, one can get an optimal O(1/T) rate. This mig...



Publication date: 2013
